How ethical data sourcing is helping to tackle child sexual abuse online

An Australian project shows how AI can make searching for and identifying illegal material a less traumatic process.

By Sophia Schmidt

When we envisage where artificial intelligence might be heading next, it is often with trepidation. After a Google engineer was suspended in June for claiming that a chatbot he had been testing was sentient, the prospect of conscious machines with the capacity to think and feel suddenly seems less remote. At the same time, a world where law enforcement uses AI to invade our privacy is a cause of great concern – a 2019 survey by the Ada Lovelace Institute found that more than half of people (55 per cent) want the government to impose restrictions on the police’s use of facial recognition technology.

But what is often neglected is how AI might be used to enhance our safety. One area where it could be of benefit is in tackling child sexual abuse and exploitation. The growth of such material online is shocking: according to the Internet Watch Foundation, child sexual abuse content has increased 15-fold, while the number of sexual images that children take of themselves has risen by 374 per cent compared with pre-pandemic levels.

The permanence of material online is another cause for concern. Such insidious material lives on in chatrooms, private messaging apps, illicit websites and on cloud platforms. In a 2017 survey by the Canadian Centre for Child Protection, 67 per cent of abuse victims reported that the distribution of their images continues to affect them, and nearly 70 per cent said they worry constantly about being recognised by someone who has seen images of their abuse.

Governments, law enforcement and child protection services will need to take the leading role in tackling this problem, and AI could help them. One project in Australia is using ethical data sourcing to improve the detection of child sexual abuse material. The “My Pictures Matter” initiative has been created by researchers at the AiLECS Lab, a collaboration between Monash University and the Australian Federal Police. It asks adult members of the public to upload consensual, fully clothed pictures of themselves as children (aged 0-17) to the project website, with the aim of crowdsourcing 100,000 “happy” childhood pictures.


A machine learning algorithm will then be trained on the pictures to learn what a child looks like at different stages of childhood. The resulting model could then be used to detect images of child sexual abuse when a suspected offender’s laptop is seized, distinguishing abusive images from benign ones. According to the researchers, it will ultimately be able to scan files and flag indecent imagery far more quickly than a human could, streamlining referrals to the police while minimising investigators’ repeated exposure to the material.

Unlike most child sexual abuse material databases, the project’s dataset consists of safe, consensually shared imagery. Nina Lewis, the project lead, tells Spotlight that this approach aims to “facilitate informed and meaningful consent” for the use of children’s images in machine learning research.


The UK has its own Child Abuse Image Database (CAID), which helps the police to identify victims and offenders. Set up in 2014, it has significantly sped up the process of reviewing images, police forces report. “Previously, a case with 10,000 images would typically take up to three days,” said one of the first forces to use it. “Now, after matching images against CAID, a case like this can be reviewed in an hour.” The project received a further £7m funding boost in 2020. To improve efficiency, the Home Office worked with the technology company Vigil AI on an AI tool that speeds up the process of identifying and classifying abusive images.

While the UK’s database serves the same purpose as the My Pictures Matter project, the ethical implications differ. Rather than asking consenting adults to share their own childhood pictures, CAID relies on child abuse imagery itself to function: in the six months to January 2021, 1.3 million unique images were added to the database. Retaining this imagery can cause further distress and trauma to victims, as well as to the police officers and online moderators tasked with reviewing the content.

Australia’s pilot project demonstrates how ethical AI might improve the process of identifying such material. In the UK, the Online Safety Bill, currently making its way through parliament, aims to make the internet safer, especially for children and teenagers, by putting the onus on social media and tech platforms to tackle content such as child sexual abuse material. They will have an obligation to monitor their sites, take down offensive material, and even prevent people from seeing it in the first place. The AiLECS Lab project shows how a more consensual approach to data collection could make the process easier for both victims and moderators.

